ABSTRACT

This  study  describes  a  computerized  item-selection 
program  called  PAIN  that  uses  a  pattern-analysis  approach 
to  select  a  most-valid  subset  of  items  from  a  set.  The 
results  of  this  study  indicate  that  PAIN  is  capable  of 
selecting  a  small  subset  of  items  which,  when  scored  by 
pattern  analysis,  has  greater  validity  than  the  original 
set.  It  appears  that,  as  well  as  reducing  the  sizes  of 
standard tests without losing predictive value, PAIN may
also be of value in selecting biographical items of
information for use as predictors.

I. INTRODUCTION

Testing has played, and in all likelihood will continue
to play, a major role in the classification and selection
processes of both industry and the military. The vast
amounts of time and money expended in this area warrant
investigation of any methods which might increase the
efficiency of the techniques involved or improve the end
results.

It was the objective of this study to investigate one
such method.

II.  NATURE  OF  THE  PROBLEM 

Testing has played an important role in the military
system of classification and placement for many years.
Basic  schooling  assignments  and  eligibility  for  promotion 
are  just  two  of  the  more  important  areas  that  have  been 
greatly  influenced  by  testing  and  test  interpretation.  Yet, 
on  many  occasions  the  critical  time  element  involved  in 
testing  and  the  lack  of  quality  information  available  from 
tests  have  combined  to  make  classification  and  placement  a 
haphazard  affair. 

The U.S. Navy has taken steps to reduce the magnitude
of the testing problem by the development of a computer
program called SEQUIN (SEQUential Item Nominator). As a
result of the use of SEQUIN it has been shown not only that
the size of a test may be reduced without loss of validity,
but also that validity may actually be increased by using a
specially selected subset of questions from the original
test. In some cases as few as seven items from a test were
found to provide information equal to or better than that
provided by the complete test. This being the case, it
appeared that pattern analysis of a few selected items from
a test might be feasible.

The problem of using pattern analysis on a test even as
small as 30 items is one of sheer size. A test of 30-item
size yields over a billion possible patterns. The evaluation
of this number of patterns is a formidable job for
even a computer, not to mention the problem involved with
interpretation of individual results once all the patterns
have been evaluated. In fact, in order to establish a
predictor value for each pattern that could be encountered,
at least a billion subjects would have had to already have
taken the test under consideration.

Reducing the size of a test to seven items means that
only 128 patterns have to be analyzed. The number of
patterns involved is found by raising the number 2 to the
power indicated by the number of items in the test. A
subset of seven-item size would thus be suitable for pattern
analysis.
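
The pattern counts quoted above can be checked with a short
calculation. The following Python sketch is illustrative only and is
not part of the original FORTRAN programs; the function name
pattern_count is hypothetical.

```python
def pattern_count(n_items: int) -> int:
    """Number of distinct correct/incorrect response patterns
    for a test of n_items binary-scored items (2 ** n_items)."""
    return 2 ** n_items


# Subset sizes discussed in the text: 7 items and 30 items.
for n in (7, 30):
    print(n, "items ->", pattern_count(n), "patterns")
```

A seven-item subset gives 128 patterns, while a 30-item test gives
more than a billion, which is the size problem described above.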

The  objective  of  this  study  was  to  devise  and  evaluate 
a  method  of  selecting  items  from  a  test  that  would  optimize 
the  validity  of  the  subset  selected  when  scored  by  pattern 
analysis. 


III. DEVELOPMENT OF A SOLUTION

A.  SUBJECTS  OF  THE  STUDY 

The records of approximately 2,400 U.S. Navy enlisted
men who had attended the Electronics Technician School at
San Diego, California, after taking the Electronics
Technician Selection Test (ETST) were used as the source data of
this study. The validation sample consisted of the first
1,500 subjects in the records who had completed the course
of instruction and been assigned a final school grade. The
cross-validation sample was composed of the next 750
subjects who met the completion and final-grade assignment
requirements.

The ETST is made up of three parts totaling 70 items.
Part I consists of 20 items designed to test the subject in
the area of mathematics. Part II is of 20-item length also
and is related to science. Part III consists of items
directed at testing knowledge in the area of electricity and
radio and has 30 items in it.

Each of the items on the ETST was treated as a predictor
variable to be compared with the criterion of final school
grade at the Electronics Technician School.

The computer programs used in this study were written in
the FORTRAN language and run on the IBM 360 computer at the
U. S. Naval Postgraduate School, Monterey, California. The
INTEGER*2 numbering convention was used where possible in
programing to conserve core storage area. The increased
time involved in running the program with the use of this
convention was not considered critical for this study.

B. DATA CONVERSION

The program to select items for pattern analysis was
developed on the premise that all items of the set being
considered could be expressed in the form of a "yes-no" or
"correct-incorrect" answer. This simplified the programing
by allowing the item responses to be handled in a binary
form.

The conversion of the raw data was not suitable to a
manual method of handling because over 168,000 responses
required coding. The conversion was done by using the
conversion program shown in the COMPUTER PROGRAMS
section (p. 31). This program facilitated the handling of the
large volume of information. Most of this program is unique
to the situation imposed on the author by the form of the
data available. However, the comments contained within this
program provide a guideline to the steps required in
converting data regardless of the nature of the data.

C.  PAIN 

The author desired to develop a computer program, which
was to be called PAIN (Pattern Analysis Item Nominator),
that would select a subset of items from the ETST. SEQUIN
could already select a subset of items from the ETST but in
a way different from that proposed by PAIN. PAIN was based
on the belief that the pattern of responses could contribute
more to the overall value of a predictor than was
presently being obtained through the use of SEQUIN or any
other method. To do this it was necessary for PAIN to be
able to assign scores to each of the possible response
patterns associated with a subset of items. The score assigned
to a pattern in pattern analysis is the mean score of all
subjects in a sample who have that pattern. Once a
correlation coefficient was determined for a given subset, it
would then be necessary to compare this coefficient with
that obtained through the examination of every other subset
of the same size available from the main set. This was
impractical for reasons which will be explained, and an
alternate approach was necessary if PAIN was to be used.
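
The scoring rule just described (each pattern receives the mean
criterion score of the subjects who show it, and the assigned scores
are then correlated with the criterion) can be sketched in Python as
follows. This is a minimal illustration under stated assumptions, not
the author's FORTRAN program: the function names, the 0-based item
indexes, and the six-subject toy data are all hypothetical.

```python
from collections import defaultdict
from statistics import mean


def pattern_key(responses, items):
    """Tuple of 0/1 answers to the chosen items (0-based column indexes)."""
    return tuple(responses[i] for i in items)


def pattern_means(rows, grades, items):
    """Mean criterion score of all subjects sharing each response pattern."""
    groups = defaultdict(list)
    for row, grade in zip(rows, grades):
        groups[pattern_key(row, items)].append(grade)
    return {pattern: mean(vals) for pattern, vals in groups.items()}


def pearson(x, y):
    """Pearson correlation; returns 0.0 when either variable is constant."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / (sxx * syy) ** 0.5


def subset_validity(rows, grades, items):
    """Validity of a subset: correlation of pattern-mean scores with grades."""
    means = pattern_means(rows, grades, items)
    assigned = [means[pattern_key(row, items)] for row in rows]
    return pearson(assigned, grades)


# Hypothetical toy data: six subjects, two binary items, final grades.
rows = [[1, 0], [1, 0], [0, 1], [0, 1], [1, 1], [0, 0]]
grades = [80, 84, 70, 74, 90, 60]

print(pattern_means(rows, grades, [0, 1]))
print(subset_validity(rows, grades, [0]))
print(subset_validity(rows, grades, [0, 1]))
```

In the validation sample itself, adding an item can only split the
existing patterns further, so the validity computed this way does not
decrease as items are added; this is one reason cross-validation is
needed.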

The number of different subsets of N items that can be
formed by a 70-item set is expressed as the combination of
70 items taken N at a time. This meant that to investigate
a subset as small as three items in size would have
involved the examination of 54,740 possible subsets, each of
which contained eight patterns of response. From the
information available concerning SEQUIN it appeared that a
subset of seven items would be necessary, at the least, if
improvement was desired over the methods presently
available.
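
The three-item figure above follows directly from the combination
formula. The Python sketch below, which is illustrative only and uses
hypothetical function names, reproduces it with the standard-library
function math.comb.

```python
import math


def subsets_to_examine(total_items: int, subset_size: int) -> int:
    """Number of distinct subsets: total_items taken subset_size at a time."""
    return math.comb(total_items, subset_size)


def patterns_per_subset(subset_size: int) -> int:
    """Each subset of binary items has 2 ** subset_size response patterns."""
    return 2 ** subset_size


print(subsets_to_examine(70, 3), "subsets of three items")
print(patterns_per_subset(3), "patterns in each subset")
```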

A seven-item subset would allow the 1,500 subjects of
the validation sample to be placed in the 128 response
patterns involved with an average distribution of slightly
less than 12 subjects per pattern. This number was felt to
be sufficient to establish a fairly stable mean score for
each pattern. A second advantage of using seven items in
the subset was that it would allow ready comparison with
the work of Lieutenant K. Weinberg (personal communication).
Lieutenant Weinberg had used the same raw data to investigate
the validity of the seven items from the ETST selected
as the best predictors by SEQUIN. Unless a reasonable
alternative to the examination of all possible response
patterns was taken, however, this would have meant the
investigation of over 77 trillion patterns, a job that was
beyond even a computer approach. This was just for the
selection of the seventh item of the subset!

In order to overcome the problem of size, the assumption
was made that once an item had been selected as the best
for a subset of given size, it would continue to be a part
of any larger subset. This allowed the author to say that
the item selected as the best item for the subset of one
item would be a part of the subset of two items, both of
which would be part of the subset of three items, etc. This
same approach is used in both stepwise regression and
SEQUIN, and would reduce the selection process for the
seventh item to an examination of slightly over 8,000
patterns. After PAIN was operative, tests were made to
determine the effects of the item-retention assumption on
the overall validity of the solution.

To test the item-retention assumption, subsets of two
items each were selected randomly from each of the three
parts of the ETST to act as the two-item subset in the PAIN
program. The program was then allowed to select the items
for the completion of the subsets of six-item size. The
validities of these subsets were then compared with the
validity associated with the selection, by PAIN, of all the
items in a subset. The two forced items were selected from
individual parts of the ETST rather than the total ETST
because the results of the unrestricted selection by PAIN
indicated that certain sections of the ETST were more valid
than other sections.
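
The item-retention assumption amounts to a forward (greedy)
selection: at each step every remaining item is tried alongside the
items already retained, and the one giving the highest
pattern-analysis validity is kept. The Python sketch below illustrates
that idea under stated assumptions; it is not the PAIN program itself.
The function names, the forced argument (which mimics the test with
two pre-selected items), the 0-based item indexes, and the toy data
are all hypothetical.

```python
from collections import defaultdict
from statistics import mean


def validity(rows, grades, items):
    """Correlate pattern-mean scores with grades for the given item columns."""
    groups = defaultdict(list)
    for row, grade in zip(rows, grades):
        groups[tuple(row[i] for i in items)].append(grade)
    means = {pattern: mean(vals) for pattern, vals in groups.items()}
    assigned = [means[tuple(row[i] for i in items)] for row in rows]
    ma, mg = mean(assigned), mean(grades)
    cov = sum((a - ma) * (g - mg) for a, g in zip(assigned, grades))
    var_a = sum((a - ma) ** 2 for a in assigned)
    var_g = sum((g - mg) ** 2 for g in grades)
    if var_a == 0 or var_g == 0:
        return 0.0
    return cov / (var_a * var_g) ** 0.5


def greedy_select(rows, grades, subset_size, forced=()):
    """Grow the subset one item at a time, never dropping a retained item."""
    chosen = list(forced)
    n_items = len(rows[0])
    while len(chosen) < subset_size:
        candidates = [i for i in range(n_items) if i not in chosen]
        best = max(candidates,
                   key=lambda i: validity(rows, grades, chosen + [i]))
        chosen.append(best)
    return chosen


def patterns_examined(total_items, step):
    """Patterns evaluated when choosing item number `step`
    with the previous step - 1 items already retained."""
    return (total_items - (step - 1)) * 2 ** step


# Hypothetical toy data: six subjects, three binary items.
rows = [[0, 1, 0], [1, 0, 0], [0, 0, 1], [1, 1, 1], [0, 1, 1], [1, 0, 1]]
grades = [60, 62, 80, 85, 82, 84]

print(greedy_select(rows, grades, 2))              # unconstrained
print(greedy_select(rows, grades, 2, forced=[1]))  # first item forced
print(patterns_examined(70, 7))                    # seventh item of 70
```

With six items retained, 64 candidates remain for the seventh
position, each forming a subset with 128 patterns, which gives the
"slightly over 8,000" patterns mentioned above.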

PAIN operated by computing mean criterion scores for each
pattern of responses in a given subset, assigning these
scores to subjects having that pattern of responses, and
correlating assigned scores with the subjects' final school
grades. PAIN provided the following information when run:

1. Validities of all subsets examined.

2. A list of the items that form the most valid subset
of a given size.

3. The validity of the most valid subset of each size.

The final form of PAIN is contained in the COMPUTER
PROGRAMS section (p. 3^). Representative run times and core
storage areas for this program on the IBM 360 computer are
contained in Table 2 (p. 22). Details on the roles of
important variables and how this program can be adapted for
general use are contained in APPENDIX A.

D. CROSS-VALIDATION

That program which the author calls "cross-validation"
is in fact a combination of two separate programs. The
first section of the cross-validation program was written
to obtain mean scores for patterns of responses to items
selected from the validation sample. This was done in the
validation program but could not be output because it was
not known while the program was running which subset would
eventually be wanted. Since the score for each pattern of
response changed whenever a new item was examined, it would
have been necessary to store all scores for each pattern in
the computer until the best item for inclusion in the
subset was found, or to print out the pattern values of all
subsets examined. On the other hand, the process of
obtaining a mean score for each pattern was relatively easy and
quick once all of the items in the subset were known.

The  second  part  of  the  cross-validation  program  did  in 
fact  perform  cross-validation.  The  program  assigned  the 
mean  pattern  scores  from  the  validation  sample  to  subjects 
having  the  same  response  patterns  in  the  cross-validation 
sample  and  then  correlated  these  scores  with  the  final 
school  grades  of  the  750  subjects  in  this  sample. 

The fact that all patterns may not have been assigned
scores in the validation sample was handled by eliminating
subjects from the cross-validation sample who had patterns
that had not been assigned scores. This procedure was
considered acceptable because of the small number (eight) of
subjects who fell into this category for a seven-item
subset.
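
The cross-validation step described above, including the elimination
of subjects whose patterns received no score in validation, can be
sketched in Python as follows. This is an illustration under stated
assumptions and not the author's FORTRAN program; the function names,
the 0-based item indexes, and the toy samples are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def pattern_means(rows, grades, items):
    """Mean criterion score for each response pattern in a sample."""
    groups = defaultdict(list)
    for row, grade in zip(rows, grades):
        groups[tuple(row[i] for i in items)].append(grade)
    return {pattern: mean(vals) for pattern, vals in groups.items()}


def pearson(x, y):
    """Pearson correlation; 0.0 if undefined (too few or constant values)."""
    if len(x) < 2:
        return 0.0
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / (sxx * syy) ** 0.5


def cross_validate(val_rows, val_grades, cv_rows, cv_grades, items):
    """Score the cross-validation sample with validation-sample pattern
    means; subjects with an unscored pattern are dropped and counted."""
    means = pattern_means(val_rows, val_grades, items)
    assigned, kept, dropped = [], [], 0
    for row, grade in zip(cv_rows, cv_grades):
        pattern = tuple(row[i] for i in items)
        if pattern in means:
            assigned.append(means[pattern])
            kept.append(grade)
        else:
            dropped += 1
    return pearson(assigned, kept), dropped


# Hypothetical toy samples; pattern (1, 0) never occurs in validation.
val_rows = [[0, 0], [0, 0], [0, 1], [1, 1]]
val_grades = [60, 64, 75, 90]
cv_rows = [[0, 0], [0, 1], [1, 1], [1, 0]]
cv_grades = [58, 77, 88, 70]

r, dropped = cross_validate(val_rows, val_grades, cv_rows, cv_grades, [0, 1])
print("cross-validation validity:", round(r, 3), "subjects dropped:", dropped)
```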

The cross-validation program provided the following
information, given the items that form the subset:

1. A coded identification of the pattern of responses.
APPENDIX B explains how to construct the patterns from
the code.

2. The mean score for each pattern encountered in the
validation sample.

3. An indication of which patterns of response were not
encountered in the validation sample.

4. The validity of the validation sample.

5. The validity in cross-validation.

6. The number of subjects eliminated from the
cross-validation sample because their patterns were not scored
in validation.

The cross-validation program was also used to investigate
what improvement in validity was obtainable through
the use of PAIN over a random selection of a subset of
items to use in pattern analysis.

The final form of the cross-validation program is
contained in the COMPUTER PROGRAMS section (p. 31). For a
seven-item subset this program had a running time of
approximately ten seconds on the IBM 360 computer and used a
core storage area of approximately 80K. APPENDIX A explains
in detail how this program can be adapted for general use.

IV. RESULTS

PAIN selected items 20, 14, 40, 56, 7, 5, and 33 in that
order as the most predictive seven-item subset of the ETST.
The validity of the seven items was .828 in validation for
1,500 subjects and .778 for the cross-validation of 750
subjects.

By using the pattern-analysis technique to score the
subset of items selected by PAIN, it was possible to exceed
the validity that had previously been attached to the ETST
as a predictor of final school grades. The Navy had
determined the predictive validity of the ETST to be
approximately .61 using the total number of all 70 items
correct as the predictor. A subset of as few as three items
selected by PAIN was capable of establishing a predictive
validity of .66 in cross-validation.

The cross-validation results for PAIN-selected items
were an improvement over the .72 value of validity obtained
in cross-validation by Lieutenant Weinberg in his study of
SEQUIN's seven best items for predicting final school
grades.

In the 15 cases investigated, the random selection of
the first two items of the subset did not improve the
validity of any six-item subsets (see Table 3). The items
selected for the six-item subset under these conditions
consistently included items selected by PAIN when no
constraints were placed on the selection process. Eight
subsets selected only one item each that had not appeared
in the unconstrained solution. Item number 16 was the only
"new" item to appear in a six-item subset more than once,
and it appeared in six different subsets.

Random selection of seven items for pattern analysis
also consistently resulted in lower validities than those
obtained through the use of PAIN (see Table 4).

The author attempted using PAIN to select more than
seven items from the ETST in order to determine at what
point, if any, the validities in validation and/or
cross-validation would level off or decline. At the
ten-item-subset level, which was near the program size
limit imposed by the computer system's core storage capacity,
the validity was still increasing for the validation
sample. The cross-validation sample, on the other hand, did
show a decline in predictive validity at the ten-item level
(see Figure 1, p. 1? and Table 1).

Examination of the assigned scores associated with the
128 patterns representing the best seven-item subset
indicated that score assignments were not always directly
related to the number of items correct in the subset. In
some cases four correct items in the subset were assigned
a lower score than three correct items. Also, the score
assigned to getting only a certain item correct in a subset
of one size was not always the same as the score assigned
to getting only that item correct in a different size
subset.

FIGURE 1

CORRELATION COEFFICIENTS IN VALIDATION AND CROSS-VALIDATION

Note: See Table 1 for exact values of validity coefficients.

V. CONCLUSIONS

This study has in part confirmed the value of using
pattern analysis as a predictive technique. While a method
such as stepwise regression would have assigned a value to
each of the items in the subset eventually selected by
PAIN, the pattern-analysis approach allowed for the change
in value of each of these items when used in different
combinations. It appears that PAIN obtained a higher validity
than SEQUIN or stepwise-regression techniques (MN ^112
student projects in progress concurrent with this writing)
because more complete use was made of the information
available when a pattern-analysis approach to evaluation was
used.

PAIN and the scoring section of the cross-validation
program have provided a predictor of Electronics Technician
School final grades superior to any other known to the
author at the time of this writing. By using this technique
of item selection on other tests, a series of short, highly
predictive tests for other areas requiring evaluation could
be formed. A word of caution is appropriate though. The
subjects of this study answered the questions used to form
PAIN's seven-item subset while taking the 70-item ETST. The
effect on the results of taking a seven-item versus 70-item
test was not known at the time of this writing. Until it is
determined that the shortness of the test is without
adverse effect, the full implementation of testing based on
subsets cannot proceed.

The consistent appearance of certain items in subsets
formed with and without constraints on PAIN, coupled with
the lower validities resulting when PAIN was constrained,
would seem to indicate that the process of retaining items
previously selected does not reduce the overall validity of
a subset. Although the solution obtained by this method may
not be a true optimal solution, it is questionable how much
can be gained by an attempt at examining all possible
solutions.

The author would have preferred to use a much larger
sample than that used so that a much closer examination
could have been made of the point at which subset size
increases lack value. It is probable that the decline in
validity in cross-validation at the ten-item level
experienced in this study was a result of having fewer than two
subjects available for each pattern of response. In fact,
at the ten-item level over 13 percent of the
cross-validation sample was unusable because of the lack of a
scored pattern. The problem of availability and a desire on
the author's part to avoid the possible problems associated
with taking subjects who received final school grades based
on differing grading systems caused the restriction on the
size of the samples used.

A key area for more investigation is that of selecting
predictors based on biographical information. Preliminary
studies by others who have used PAIN (concurrent MN ^112
students) indicate that this technique of selection may
have great value as a method of analyzing biographical data
in relation to various criteria. This would seem logical if
one will agree that patterns of information play a more
important role in the area of biographical information than
in the area of testing. APPENDIX C gives an explanation of
how some biographical information can be converted into the
binary form necessary for PAIN. APPENDIX C also contains
examples of some of the preliminary results obtained by
using PAIN in conjunction with biographical information.
TABLE 1

CORRELATION COEFFICIENTS FOR SUBSETS OF THE ETST
IN VALIDATION AND CROSS-VALIDATION

NUMBER OF ITEMS
IN SUBSET          VALIDATION     CROSS-VALIDATION

       1             .51^56           .50002
       2             .62723           .60086
       3             .69781           .66228
       4             .7^03^           .709^0
       5             .77573           .72^20
       6             .80276           .73857
       7             .82843           .77829
       8             .85161           .78925
       9             .87731           .81201
      10             .903^6           .80213

TABLE 2

PAIN PROGRAM RUNNING TIMES AND CORE STORAGE REQUIREMENTS
FOR VARIOUS SIZE SUBSETS

NUMBER OF ITEMS    APPROXIMATE      APPROXIMATE
IN SUBSET          RUNNING TIME     CORE STORAGE REQUIRED

       7              6 Min.            270 K
       9              9 Min.            310 K
      10             13 Min.            360 K

Note: Figures presented are based on evaluating a 70-item
test from a sample of 1,500 subjects. Running times are for
IBM 360 computer.

TABLE 3

PAIN RESULTS WITH CONSTRAINTS PLACED
ON THE FIRST TWO ITEMS OF A SUBSET

TWO ITEMS      ADDITIONAL ITEMS
CONSTRAINED    SELECTED              SUBSET VALIDITY

17, 3          40, 14, 56, 7             .79603
6, 16          40, 56, 14, 21*           .79058
19, 6          40, 14, 56, 8*            .79074
1, 20          40, 14, 56, 7             .79789
14, 2          20, 40, 56, 7             .78515
39, 2^         20, 14, 56, 16*           .78450
37, 21         16*, 14, 40, 56           .79395
36, 30         20, 14, 40, 7             .77809
32, 29         14, 40, 7, 56             .77285
3^, 23         20, 14, 40, 16*           .77978
55, 1^2        16*, 14, 40, 7            .78773
54, 67         14, 40, 7, 56             .76546
^7, 63         14, 40, 7, 56             .78603
56, 48         16*, 14, 40, 7            .79528
69, 53         16*, 40, 14, 20           .77085

Note: Additional items selected are presented in the order
of selection by PAIN. The seven items originally selected
by PAIN were 20, 14, 40, 56, 7, 5, and 33 in that order.
*Indicates item not originally selected by PAIN.

TABLE 4

VALIDATION AND CROSS-VALIDATION RESULTS OF
PATTERN ANALYSIS TECHNIQUE USED ON
RANDOMLY SELECTED SUBSETS OF SEVEN ITEMS

RANDOMLY SELECTED
ITEMS                          VALIDATION    CROSS-VALIDATION

12, 14, 38, M, 42, 48, 56        .75235          .67070
4, 40, 58, 63, 66, 67, 70        .66504          .59745
15, 23, 40, 48, 54, 58, 59       .67265          .57414
4, 10, 17, 26, 34, 45, 55        .71356          .69899
6, 14, 41, 42, 53, 56, 66        .73571          .66944

Note: PAIN validation and cross-validation results for
seven items were .82843 and .77829 respectively.

APPENDIX A

PAIN AND CROSS-VALIDATION PROGRAM CONVERSION
FOR GENERAL USE

The PAIN and cross-validation programs in this study
can be used to process any data of a binary nature simply
by altering the contents of the DIMENSION and DATA
statements and ensuring that the READ statement conforms to the
device from which the data is being read. This APPENDIX is
a detailed check list of how the DATA and DIMENSION
statements should be set up by the user.

A.  PAIN  DATA  STATEMENT 

1. N1 is set equal to the size of the sample being used
in validation.

2. N2 is set equal to one (1). The program will handle
increasing this variable to conform to the size of the
subset under consideration.

3. N3 is set at a value equal to or greater than the
integer range of the criterion scores. If the criterion
scores are not in an integer form, conversion must be made
before the data are read into the program so that the
matrix involved can be addressed. A Data Conversion program
can be altered to do this if such a program is used.
Example: A 3.^5 criterion score can be converted to a 3^5
criterion score. If conversion were not made in advance,
the program would truncate this criterion score to 3. See
A7 in this Appendix for further details. The alternative to
using integer criterion values would require a degree of
manipulation within PAIN that is unwarranted for most
cases.

4. N4 is set equal to two (2). The program will handle
the increasing of this variable to conform to the number of
patterns within a given subset size.

5. N5 is set equal to the final number of items desired
in the subset.

6. N6 is set equal to the total number of items in the
set under investigation.

7. INDEX is equal to a value one less than the lower
number used in determining the range of the criterion (N3).
Example: If the criterion were student grades on a 4.0
grading scale and the investigator did not know the actual
value of the lowest grade in the sample, but knew the
lowest grade was at least higher than 1.5, the value of N3
could be set as low as 40-15, or 25, and the value of INDEX
would be 15-1, or 14. Note that if the lowest value for a
final grade had actually been 1.5 instead of higher than
1.5 the conversion would have been 40-14, or 26, for the
value of N3 and 14-1, or 13, for the value of INDEX.
Although PAIN can be run without going through the process of
assigning a value to INDEX, in many cases the core storage
and running time of the program can be greatly reduced by
using the INDEX variable.
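
The arithmetic in the example for INDEX can be restated as a small
calculation. The Python sketch below is illustrative only: the
function names are hypothetical, grades are assumed to be converted
to integers by multiplying by ten (so that 1.6 becomes 16 and 4.0
becomes 40), and the lowest possible grade "higher than 1.5" is
assumed to be 1.6.

```python
def to_integer_score(grade: float, scale: int = 10) -> int:
    """Convert a decimal grade to integer form before it is read in,
    for example 1.6 -> 16 on a 4.0 scale (scale factor is an assumption)."""
    return round(grade * scale)


def criterion_settings(lowest_score: int, highest_score: int):
    """Return (N3, INDEX) following the rule in item 7.

    The 'lower number' used for the range is one below the lowest
    integer score; N3 is the highest score minus that lower number,
    and INDEX is one less than the lower number.
    """
    lower = lowest_score - 1
    n3 = highest_score - lower
    index = lower - 1
    return n3, index


# Lowest grade known only to be higher than 1.5 (taken here as 1.6).
print(criterion_settings(to_integer_score(1.6), to_integer_score(4.0)))
# Lowest grade actually equal to 1.5.
print(criterion_settings(to_integer_score(1.5), to_integer_score(4.0)))
```

Both cases reproduce the values given in the example: 25 and 14 in
the first case, 26 and 13 in the second.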

B. CROSS-VALIDATION DATA STATEMENT

1. In order to perform cross-validation both the
validation and cross-validation data must be read into the
program, in that order.

2. N1, N3, and INDEX follow the same rules as in the
PAIN DATA statement.

3. N2 is set equal to the number of items in the subset
being cross-validated or for which mean pattern scores are
desired.

4. N4 is set equal to the number of patterns associated
with the subset size being cross-validated. This value is
equal to 2 to the N2 power.

5. N? is set equal to the number of subjects in the
cross-validation sample. Setting this value to zero (0)
results in processing only the mean pattern scores for the
validation sample.

C.  DIMENSION  STATEMENTS 

1. The variables in the DIMENSION statements are
dimensioned according to the comments at the beginning of PAIN
and the cross-validation programs.

2. Definitions of the variables involved in the
DIMENSION statements are contained in the list of non-dummy
variables preceding the computer programs in the COMPUTER
PROGRAMS section (p. 31).

APPENDIX B

INTERPRETING PATTERN CODES

All patterns of response were converted from binary
to decimal form during PAIN and the cross-validation
programs so that matrix addresses could be used. Hence, the
list of patterns printed as output to the cross-validation
program is in decimal form. The user of this program
simply needs to subtract one from the pattern number in the
output to obtain the actual decimal equivalent of the
binary number of the pattern referenced. For example, the
cross-validation program assigned a mean pattern score of
73.0 to the pattern listed as number 58. This would convert
to the decimal equivalent 57, which yields the binary
pattern 0111001. Since the items of the subset used were read
into the cross-validation program in the order 5, 7, 14, 20,
33, 40, 56, the pattern would indicate that a "correct" or
"yes" answer to items 7, 14, 20, and 56 coupled with an
"incorrect" or "no" answer to items number 5, 33, and 40
predict a criterion score of 73.0.
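
The decoding steps in the example above can be sketched in Python.
This is an illustration only; the function name decode_pattern is
hypothetical, and the assumption that the leftmost binary digit
corresponds to the first item read is taken from the worked example.

```python
def decode_pattern(listed_number, items):
    """Turn a pattern number from the output into item-level answers.

    listed_number: pattern number as printed (one more than the
                   decimal value of the binary pattern).
    items:         item numbers in the order they were read in.
    """
    value = listed_number - 1                      # e.g. 58 -> 57
    bits = format(value, "0{}b".format(len(items)))  # e.g. '0111001'
    correct = [item for item, bit in zip(items, bits) if bit == "1"]
    incorrect = [item for item, bit in zip(items, bits) if bit == "0"]
    return bits, correct, incorrect


bits, correct, incorrect = decode_pattern(58, [5, 7, 14, 20, 33, 40, 56])
print(bits)       # binary pattern
print(correct)    # items answered "correct" or "yes"
print(incorrect)  # items answered "incorrect" or "no"
```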

APPENDIX C
BIOGRAPHICAL  INFORMATION  AND  PAIN 

To  convert  biographical  information  into  the  binary 
form  necessary  for  use  in  PAIN  requires  questions  to  be 
formulated  such  that  answers  can  be  expressed  in  a  yes-no 
form. 

Questions that at first do not appear to fit a yes-no
format are already being sectioned into parts so that the
format can be used. For example, the question "How old are
you?" can be, and often is, handled in the following way:

How old are you? Check one box.

1. under 21  [ ]
2. 21 - 30   [X]
3. 31 - 40   [ ]
4. over 40   [ ]

The boxes without checks can be considered "no" answers
in this example, with the checked box a "yes". The one
question, "How old are you?", can now be handled by PAIN as
four separate items. If PAIN should indicate that one or
more of these items are good predictors, those items could
be sectioned more finely for further evaluation. Other
types of biographical information can also be handled in
this manner.
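
As a rough illustration of this sectioning, the sketch below converts an age into the four yes-no items of the example. It is written in Python rather than FORTRAN, and the names AGE_BRACKETS and age_to_items are hypothetical; only the four brackets are taken from the text.

```python
# Each bracket from the example becomes one yes-no item for PAIN.
AGE_BRACKETS = [
    ("under 21", lambda age: age < 21),
    ("21 - 30", lambda age: 21 <= age <= 30),
    ("31 - 40", lambda age: 31 <= age <= 40),
    ("over 40", lambda age: age > 40),
]


def age_to_items(age):
    """Return four binary items: 1 for the checked box, 0 for the others."""
    return [1 if in_bracket(age) else 0 for _, in_bracket in AGE_BRACKETS]


print(age_to_items(25))  # [0, 1, 0, 0]
```

Because the brackets do not overlap and together cover every age, exactly one of the four items is a "yes" for any respondent.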

The author knows of at least two studies, being conducted
by students at the U. S. Naval Postgraduate School at the
time of this writing, that involved the use of PAIN for
selecting items of a biographical nature for use in
predicting various criteria. A study using the final QPAs
of students in the Master's program at the U. S. Naval
Postgraduate School as the criterion has yielded encouraging
preliminary results. While samples were too small to justify
comparison in cross-validation, PAIN provided uniformly
higher validities than stepwise regression in validation.

The  second  study  involved  predicting  drug  addiction. 
This  study  had  found  a  four-item  subset  that  had  a  validity 
of over .60 in validation. No cross-validation results were
available  at  the  time  of  this  writing,  but  any  reasonable 
retention  of  validity  in  cross-validation  could  provide  an 
extremely  useful  predictor  in  this  field. 
COMPUTER PROGRAMS
LIST OF NON-DUMMY VARIABLES USED IN COMPUTER PROGRAMS

ANS(I) - correct response to item I of test

B(J)   - jth subject's assigned decimal value for his
         binary pattern of responses

C(J)   - jth subject's final school grade used for the
         criterion

C1     - sum of the criterion scores

C2     - sum of the squares of the criterion scores

D(J)   - jth subject's identification number

F(M,N) - the joint frequency distribution of patterns
         versus criterion scores

INDEX  - a value equal to the lowest score used in
         determining range (N3) minus 1

K      - as used in the conversion program only, the number
         of the data card on which information is stored

M      - row in the "F" matrix representing number of a
         pattern in a given subset

N      - column in the "F" matrix representing the
         criterion score

N1     - the size of the sample used in validation

N2     - number of items in the subset being considered

N3     - coded range of the criterion scores

N4     - number of patterns in the subset being considered

N5     - size of subset desired

N6     - total number of items in set being investigated

N7     - the size of the sample used in cross-validation

P(I,J) - jth subject's binary response to item i


R2     - the correlation coefficient determined from the
         use of raw scores

S(I)   - the mean pattern score for pattern i

S1     - sum of the criterion scores for a given pattern

S2     - number of subjects with a given pattern

W(I)   - answer given by subject to item i of test

X(J)   - jth subject's mean pattern score

X1     - sum of mean pattern scores

X2     - sum of squares of mean pattern scores
DATA CONVERSION PROGRAM

C  THIS PROGRAM EDITS INFORMATION ABOUT SUBJECTS WHO HAVE
C  TAKEN THE ETST, CONVERTS TEST DATA TO A BINARY FORM SUCH
C  THAT A CORRECT ANSWER IS ASSIGNED THE VALUE '1' AND AN
C  INCORRECT ANSWER IS ASSIGNED THE VALUE '0'. THIS
C  INFORMATION IS THEN TRANSFERRED TO A DATA CELL
C
      IMPLICIT INTEGER*4(A-Z)
      DIMENSION P(70),ANS(70),W(70)
      DATA NREAD,NWRITE,NPUNCH/8,9,7/
C
C  THE CORRECT ANSWERS TO ALL ETST QUESTIONS ARE READ IN
C
      READ (5,1) (ANS(I),I=1,70)
    1 FORMAT (70I1)
C
C  INFORMATION ON EACH SUBJECT IS READ IN
C

c 


C 

C  CONVERT EACH SUBJECTS ETST ANSWERS TO BINARY FORM
C
      DO 20 I=1,70
      IF(W(I).NE.ANS(I)) P(I) = 0
      IF(W(I).EQ.ANS(I)) P(I) = 1
   20 CONTINUE
C
C  OUTPUT THE EDITED AND CONVERTED INFORMATION
C
      IF(J.GT.44) GO TO 82
      GO TO 84
   82 IF(J.GT.150) GO TO 84
      WRITE (6,85) J,C,(P(I),I=1,70)
   85 FORMAT (I5,5X,I3,5X,70I1)
   84 WRITE (NWRITE,8) D,C,(P(I),I=1,70)
    8 FORMAT (I6,I2,70I1)
  100 CONTINUE
   50 WRITE (6,9) D,C,(P(I),I=1,70)
    9 FORMAT (I8,5X,I2,5X,70I1)
      STOP
      END
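
The scoring step at the heart of the conversion program (the DO 20 loop) can be sketched as follows. This is only a rough Python illustration, not a translation of the listing. The name score_responses is hypothetical, and the answers are assumed to be single characters, in keeping with the 70I1 input format.

```python
def score_responses(answers, key):
    """Convert one subject's answers to binary form.

    A response that matches the key is scored 1 (correct); any other
    response is scored 0 (incorrect), as in the DO 20 loop of the listing.
    """
    if len(answers) != len(key):
        raise ValueError("answers and key must cover the same items")
    return [1 if given == correct else 0 for given, correct in zip(answers, key)]


key = "13421"        # hypothetical five-item answer key
answers = "12421"    # hypothetical subject responses
print(score_responses(answers, key))  # [1, 0, 1, 1, 1]
```

The resulting list of ones and zeros plays the role of the P array written to the data cell for each subject.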
PAIN: ITEM SELECTION PROGRAM

C  THIS PROGRAM SELECTS A SUBSET OF ITEMS THAT MAXIMIZE
C  VALIDITY UNDER PATTERN ANALYSIS. THE VALUES ASSIGNED TO
C  THE DIMENSIONED VARIABLES ARE: B(N1), C(N1), D(N1),
C  F(2**N5,N3), P(N6,N1), S(2**N5), X(N1), ITEM(N5)
C
      INTEGER*2 B,C,F,P
      INTEGER C1,C2,R3,S1,S2
      DIMENSION B(1500),C(1500),D(1500),F(128,47),
     2P(70,1500),S(128),X(1500),ITEM(7)
      DATA N1,N2,N3,N4,N5,N6,INDEX/1500,1,47,2,7,70,29/
C
C  THE SUBSET THAT WILL CONTAIN THE TEST ITEMS SELECTED IS
C  INITIALIZED TO ZERO
C
      DO 11 I=1,N5
      ITEM(I)=0
   11 CONTINUE
C 

C  DATA IS READ INTO THE PROGRAM
C
      C1 = 0
      C2 = 0
      DO 13 J=1,N1
  350 READ (9,9) D(J),C(J),(P(I,J),I=1,N6)
    9 FORMAT (I6,I2,70I1)
C
C  THE FOLLOWING TWO IF STATEMENTS PREVENT THE CONSIDERATION
C  OF ANY SUBJECT WHO HAS A CRITERION SCORE OUTSIDE THE
C  RANGE LIMITS USED IN ESTABLISHING N3 AND INDEX
C
      IF(C(J).LT.30) GO TO 350
      IF(C(J).GT.76) GO TO 350
      C1=C(J)+C1
      C2=C(J)*C(J)+C2
   13 CONTINUE
C 

C  THIS LOOP CONTROLS WHICH ITEM OF THE SUBSET IS BEING
C  SELECTED DURING THE CURRENT ROUND OF EXAMINATIONS
C
      DO 220 L=1,N5
      RH=0.0
C
C  THIS LOOP CONTROLS WHICH ITEM FROM THE TOTAL SET OF ITEMS
C  IS BEING CONSIDERED FOR EXAMINATION
C
      DO 200 KA=1,N6
C
C  THIS LOOP PREVENTS CONSIDERATION OF AN ITEM ALREADY
C  SELECTED TO BE A MEMBER OF THE SUBSET
C
      DO 14 I=1,N2
      IF(KA.EQ.ITEM(I)) GO TO 200
   14 CONTINUE
C 

C  THE F MATRIX IS INITIALIZED TO ZERO
C
      DO 12 I=1,N4
      DO 17 J=1,N3
      F(I,J)=0
   17 CONTINUE
   12 CONTINUE
C
C  THE JOINT FREQUENCY DISTRIBUTION OF PATTERN AND CRITERION
C  SCORES IS DETERMINED AND BINARY PATTERNS ARE CONVERTED TO
C  DECIMAL EQUIVALENTS TO BE USED AS ROW ADDRESSES.
C
      DO 18 J=1,N1
      M=1
      K = N4
      DO 19 I=1,N2
      K = K/2
      IF(ITEM(I).EQ.0) GO TO 19
      M=K*P(ITEM(I),J)+M
   19 CONTINUE
      M=K*P(KA,J)+M
      N=C(J)-INDEX
      F(M,N)=F(M,N)+1
      B(J)=M
   18 CONTINUE
C
C  THE MEAN CRITERION SCORES FOR EACH PATTERN ARE COMPUTED
C
      DO 20 I=1,N4
      S1=0
      S2=0
      DO 21 J=1,N3
      S2=F(I,J)+S2
      S1=(J+INDEX)*F(I,J)+S1
   21 CONTINUE
      IF(S2.EQ.0) GO TO 10
      S(I)=S1/S2
      GO TO 20
   10 S(I)=0.0
   20 CONTINUE
C
C  MEAN SCORES FOR PATTERNS ARE ASSIGNED TO EACH SUBJECT
C  ACCORDING TO HIS PATTERN
C
      DO 31 J=1,N1
      K=B(J)
      X(J)=S(K)
   31 CONTINUE
C
C  CORRELATION COEFFICIENT BETWEEN CRITERION AND PATTERN
C  FORMED USING ITEM UNDER CONSIDERATION IS COMPUTED.
C
      X1=0.0
      X2=0.0
      W=0.0
      DO 41 J=1,N1
      X1=X(J)+X1
      X2=X(J)*X(J)+X2
      W=C(J)*X(J)+W
   41 CONTINUE
      R1=(N1*X2)-(X1*X1)
      IF(R1.EQ.0.0) GO TO 85
      R3=(N1*C2)-(C1*C1)
      R5=(N1*W)-(C1*X1)
      Q=(R1*R3)**0.5
      R2=R5/Q
      GO TO 86
   85 R2=0.0
C
C  IT IS DETERMINED IF THE CORRELATION COEFFICIENT USING
C  ITEM PRESENTLY UNDER CONSIDERATION IS HIGHER THAN HIGHEST
C  PREVIOUSLY FOUND CORRELATION COEFFICIENT WITH SAME SIZE
C  SUBSET. ITEM NUMBER AND CORRELATION COEFFICIENT ARE
C  STORED IF THEY ARE HIGHER.
C
   86 WRITE (6,54) KA,R2
   54 FORMAT (' THE CORRELATION OF ITEM ',I4,' WITH THE',
     2' PREVIOUSLY SELECTED ITEMS IS',F10.9)
      IF(R2.GT.RH) GO TO 81
      GO TO 200
   81 RH=R2
      ITEMH=KA
  200 CONTINUE
C
C  PRINT OUT THE ITEM NUMBERS OF ITEMS USED TO GET THE
C  HIGHEST CORRELATION COEFFICIENT FOR THE SIZE SUBSET UNDER
C  CONSIDERATION AND THE VALUE OF THIS COEFFICIENT.
C
      ITEM(L)=ITEMH
      WRITE (6,160) L,L,(ITEM(I),I=1,N5),RH
  160 FORMAT ('THE BEST ',I2,' ITEMS TO USE FOR A ',I2,
     2' ITEM SUBSET ARE:',/' ',10I5,' AND THEY YIELD A',
     3' CORRELATION COEFFICIENT OF ',F10.9,//)
C
C  ADVANCE THE NUMBER OF ITEMS IN THE SUBSET AND THE NUMBER
C  OF PATTERNS POSSIBLE THEN REPEAT THE EXAMINATION PROCESS
C
      N2=N2+1
      N4=2**N2
  220 CONTINUE
      STOP
      END
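
The selection procedure in the PAIN listing can be summarized by the rough Python sketch below. It is an illustration under stated assumptions, not a translation of the author's program. The function names are hypothetical, items are numbered from zero, pattern addresses start at zero rather than one, and mean pattern scores use ordinary rather than integer division.

```python
from math import sqrt


def pattern_address(responses, items):
    """Decimal value (0-based) of the binary pattern on the given items."""
    address = 0
    for item in items:  # the first item carries the largest weight
        address = 2 * address + responses[item]
    return address


def pattern_validity(data, criterion, items):
    """Correlate the criterion with each subject's mean pattern score."""
    addresses = [pattern_address(row, items) for row in data]
    totals, counts = {}, {}
    for address, score in zip(addresses, criterion):
        totals[address] = totals.get(address, 0) + score
        counts[address] = counts.get(address, 0) + 1
    x = [totals[a] / counts[a] for a in addresses]  # mean pattern scores
    n = len(criterion)
    x1, c1 = sum(x), sum(criterion)
    x2 = sum(v * v for v in x)
    c2 = sum(v * v for v in criterion)
    w = sum(a * b for a, b in zip(x, criterion))
    r1 = n * x2 - x1 * x1
    r3 = n * c2 - c1 * c1
    if r1 <= 1e-9 or r3 <= 1e-9:  # no variance: report zero correlation
        return 0.0
    return (n * w - c1 * x1) / sqrt(r1 * r3)


def select_items(data, criterion, subset_size):
    """Add, one at a time, the item that gives the highest validity."""
    selected, history = [], []
    for _ in range(subset_size):
        best_item, best_r = None, 0.0
        for candidate in range(len(data[0])):
            if candidate in selected:
                continue
            r = pattern_validity(data, criterion, selected + [candidate])
            if r > best_r:
                best_item, best_r = candidate, r
        if best_item is None:
            break
        selected.append(best_item)
        history.append(best_r)
    return selected, history


# Hypothetical data: eight subjects, three items; item 1 carries no information.
data = [[0, 0, 0], [0, 1, 0], [0, 0, 1], [0, 1, 1],
        [1, 0, 0], [1, 1, 0], [1, 0, 1], [1, 1, 1]]
criterion = [40, 40, 50, 50, 60, 60, 70, 70]
print(select_items(data, criterion, 2))
```

As in the listing, each round keeps the items already chosen, tries every remaining item in turn, and retains the one whose addition yields the highest correlation between criterion and mean pattern score.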
CROSS-VALIDATION PROGRAM

//  EXEC FORTCLG
C  THIS PROGRAM PRINTS THE MEAN PATTERN SCORES FOR THE
C  SUBJECTS IN THE VALIDATION SAMPLE AND CROSS-VALIDATES THE
C  CROSS-VALIDATION SAMPLE. THE VALUES ASSIGNED TO THE
C  DIMENSIONED VARIABLES ARE: B(N1), C(N1), F(N4,N3), P(N2,N1),
C  S(N4), X(N1)
      INTEGER*2 B,C,D,F,P
      INTEGER*4 C1,C2,R3,S1,S2
      DATA N1,N2,N3,N4,N7,INDEX/1500,7,47,128,750,29/
C 

C  DATA IS READ IN FROM THE INPUT DEVICE
C
      DO 220 L=1,2
      IF(L.EQ.2) N1=N7
      IF(N1.EQ.0) GO TO 900
      DO 13 J=1,N1
  350 READ (9,9) C(J),(P(I,J),I=1,N2)
    9 FORMAT (T7,I2,T13,I1,T15,I1,T22,I1,T28,I1,T30,I1,
     2T48,I1,T64,I1)
C
C  THE FOLLOWING TWO IF STATEMENTS PREVENT THE CONSIDERATION
C  OF ANY SUBJECT WHO HAS A CRITERION SCORE OUTSIDE THE
C  RANGE LIMITS USED IN ESTABLISHING N3 AND INDEX
C
      IF(C(J).LT.30) GO TO 350
      IF(C(J).GT.76) GO TO 350
   13 CONTINUE
      RH=0.0
C 

C  THE F MATRIX IS INITIALIZED TO ZERO
C
      DO 12 I=1,N4
      DO 17 J=1,N3
      F(I,J)=0
   17 CONTINUE
   12 CONTINUE
C
C  THE JOINT FREQUENCY DISTRIBUTION OF PATTERNS AND
C  CRITERION SCORES IS DETERMINED AND BINARY PATTERNS ARE
C  CONVERTED TO DECIMAL EQUIVALENTS TO BE USED AS ROW
C  ADDRESSES
C
      DO 18 J=1,N1
   29 M=1
      K = N4
      DO 19 I=1,N2
      K = K/2
      M=K*P(I,J)+M
   19 CONTINUE
      N=C(J)-INDEX
      F(M,N)=F(M,N)+1
      B(J)=M
   18 CONTINUE
      IF(L.EQ.2) GO TO 200

C 

C  THE MEAN CRITERION SCORES FOR EACH PATTERN ARE COMPUTED
C  AND PRINTED OUT.
C
      WRITE (6,94)
   94 FORMAT (' ',5X,'PATTERN NUMBER',10X,'MEAN CRITERION',
     2' SCORE',//)
      DO 20 I=1,N4
      S1 = 0
      S2=0
      DO 21 J=1,N3
      S2=F(I,J)+S2
      S1=(J+INDEX)*F(I,J)+S1
   21 CONTINUE
      IF(S2.LT.1.0) GO TO 10
      S(I)=S1/S2
      WRITE (6,93) I,S(I)
   93 FORMAT (' ',11X,I3,20X,F12.5)
      GO TO 20
   10 S(I)=0.0
      WRITE (6,90) I
   90 FORMAT (' NO SUBJECT HAD THE FOLLOWING PATTERN DURING',
     2' ITEM SELECTION PROGRAM',I4)
   20 CONTINUE
  200 CONTINUE
C
C  MEAN SCORES FOR PATTERNS ARE ASSIGNED TO EACH SUBJECT.
C
      NR = 0
      DO 31 J=1,N1
      K=B(J)
      X(J)=S(K)
      IF(S(K).LT.0.9) GO TO 91
      GO TO 31
   91 NR=NR+1
      C(J)=0
   31 CONTINUE
      IF(L.EQ.1) GO TO 32
      WRITE (6,92) NR
   92 FORMAT (I6,' SUBJECTS HAVE PATTERN SCORES NOT',
     2' ENCOUNTERED DURING THE PAIN PROGRAM')
C
C  THE CORRELATION COEFFICIENT IN VALIDATION AND
C  CROSS-VALIDATION FOR THE ITEMS UNDER CONSIDERATION IS
C  COMPUTED AND PRINTED OUT.
C
   32 C1 = 0
      C2=0
      X1=0.0
      X2=0.0
      W=0.0
      DO 41 J=1,N1
      C1=C(J)+C1
      C2=C(J)*C(J)+C2
      X1=X(J)+X1
      X2=X(J)*X(J)+X2
      W=C(J)*X(J)+W
   41 CONTINUE
      N1=N1-NR
      R1=(N1*X2)-(X1*X1)
      R3=(N1*C2)-(C1*C1)
      R5=(N1*W)-(C1*X1)
      Q=(R1*R3)**0.5
      R2=R5/Q
   81 RH=R2
      IF(L.EQ.2) GO TO 220
  160 FORMAT ('0','THE ITEMS UNDER CONSIDERATION YIELD A',
     2' VALIDITY OF ',F10.5,' IN VALIDATION')
  220 CONTINUE
      WRITE (6,54) N2,RH
   54 FORMAT (' THE VALIDITY OF THE',I2,' ITEMS CONSIDERED',
     2' FOR CROSS-VALIDATION IS ',F10.5)
  900 STOP
      END
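
The cross-validation step can be sketched in the same spirit. The Python below is a rough illustration under stated assumptions, not the author's program. The function names are hypothetical, items are numbered from zero, and a subject whose pattern never occurred in the validation sample is dropped from the correlation and counted, which is how the listing appears to treat such subjects.

```python
from math import sqrt


def pattern_address(responses, items):
    """Decimal value (0-based) of the binary pattern on the given items."""
    address = 0
    for item in items:
        address = 2 * address + responses[item]
    return address


def fit_pattern_means(data, criterion, items):
    """Mean criterion score for each pattern seen in the validation sample."""
    totals, counts = {}, {}
    for row, score in zip(data, criterion):
        address = pattern_address(row, items)
        totals[address] = totals.get(address, 0) + score
        counts[address] = counts.get(address, 0) + 1
    return {address: totals[address] / counts[address] for address in totals}


def pearson(x, y):
    """Product-moment correlation from raw sums; 0.0 if it is undefined."""
    n = len(x)
    if n < 2:
        return 0.0
    sx, sy = sum(x), sum(y)
    r1 = n * sum(v * v for v in x) - sx * sx
    r3 = n * sum(v * v for v in y) - sy * sy
    if r1 <= 1e-9 or r3 <= 1e-9:
        return 0.0
    r5 = n * sum(a * b for a, b in zip(x, y)) - sx * sy
    return r5 / sqrt(r1 * r3)


def cross_validate(means, data, criterion, items):
    """Score a new sample with validation-sample means and correlate."""
    predicted, observed, unseen = [], [], 0
    for row, score in zip(data, criterion):
        address = pattern_address(row, items)
        if address not in means:  # pattern not met during validation
            unseen += 1
            continue
        predicted.append(means[address])
        observed.append(score)
    return pearson(predicted, observed), unseen


# Hypothetical validation and cross-validation samples.
train = [[0, 0, 0], [0, 1, 0], [0, 0, 1], [0, 1, 1], [1, 0, 0], [1, 1, 0]]
train_scores = [40, 40, 50, 50, 60, 60]
means = fit_pattern_means(train, train_scores, [0, 2])
holdout = [[0, 0, 0], [0, 1, 1], [1, 1, 0], [1, 0, 1]]
holdout_scores = [42, 52, 58, 68]
print(cross_validate(means, holdout, holdout_scores, [0, 2]))
```

The pattern means are fixed from the validation sample alone. The cross-validation sample is then scored with those means, so the resulting correlation shows how well the validity holds up on subjects who played no part in item selection.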